10:16
2026-04-27
ianbarber.blog
large-language-models
Loss Exploded.
Meta's FAIR team documented a series of training failures in 2021 for their OPT-175B model, including repeated loss explosions and learning issues that required extensive hyperparameter tuning and arc…